Using Multiple Speech Recognition Results to Enhance STD with Suffix Array on the NTCIR-10 SpokenDoc-2 Task
Abstract
We previously proposed a fast spoken term detection (STD) method that uses a suffix array as its data structure. By applying dynamic time warping to the suffix array, we achieved very fast keyword detection in very large-scale speech documents. In this study, we modify the method so that it can handle multiple recognition results. Using results obtained from several speech recognizers is expected to improve search performance through the complementary effect of different language and acoustic models. Experimental results show that the maximum F-measure and the MAP score increased by 6% to 10%.
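As a rough illustration of the idea in the abstract, the following minimal sketch indexes phoneme sequences with a suffix array, searches them with a tolerance for recognition errors, and merges the detections from several recognizers. It is not the authors' implementation. Several points are assumptions: a plain edit distance stands in for the dynamic time warping score, the merge rule is a simple union over recognizers, and all names (`build_suffix_array`, `approximate_search`, `detect`, `max_dist`) are hypothetical.

```python
def build_suffix_array(phonemes):
    """Return the start positions of all suffixes, sorted lexicographically."""
    return sorted(range(len(phonemes)), key=lambda i: phonemes[i:])


def approximate_search(phonemes, suffix_array, keyword, max_dist):
    """Return sorted start positions where some prefix of the suffix lies
    within max_dist edits of the keyword.

    Suffixes are visited in sorted order, so DP rows computed for the
    prefix shared with the previous suffix are reused instead of recomputed.
    """
    m = len(keyword)
    depth_limit = m + max_dist      # longer prefixes cannot be within max_dist
    rows = [list(range(m + 1))]     # rows[d]: distances after d suffix phonemes
    prev = []
    hits = []
    for start in suffix_array:
        cur = phonemes[start:start + depth_limit]
        lcp = 0                     # length of the prefix shared with prev
        while lcp < len(cur) and lcp < len(prev) and cur[lcp] == prev[lcp]:
            lcp += 1
        del rows[lcp + 1:]          # keep only the rows that are still valid
        for d in range(lcp, len(cur)):
            last = rows[d]
            row = [last[0] + 1]
            for j in range(1, m + 1):
                cost = 0 if cur[d] == keyword[j - 1] else 1
                row.append(min(last[j] + 1,          # extra phoneme in speech
                               row[j - 1] + 1,       # missing phoneme in speech
                               last[j - 1] + cost))  # match or substitution
            rows.append(row)
        if min(r[m] for r in rows) <= max_dist:
            hits.append(start)
        prev = cur
    return sorted(hits)


def detect(recognizer_outputs, keyword, max_dist):
    """Union of utterance IDs in which any recognizer's transcript matches.

    recognizer_outputs maps a recognizer name to {utterance_id: phoneme list}.
    The union is an assumed merge rule chosen for illustration only.
    """
    detected = set()
    for utterances in recognizer_outputs.values():
        for utt_id, phonemes in utterances.items():
            # A real system would build each suffix array once, offline.
            sa = build_suffix_array(phonemes)
            if approximate_search(phonemes, sa, keyword, max_dist):
                detected.add(utt_id)
    return detected
```

With two recognizers that make different errors on the same utterances, `detect` can return utterances that either recognizer alone would miss, which is the complementary effect the abstract refers to. A union also admits more false detections, so a practical system would need some way to control them.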
Similar resources
Utilization of Suffix Array for Quick STD and Its Evaluation on the NTCIR-9 SpokenDoc Task
We propose a technique for quickly detecting keywords in a very large speech database without requiring a large amount of memory. To accelerate the search and reduce memory use, we employ a suffix array as the data structure and apply phoneme-based DP matching to it. To avoid an exponential explosion of processing time with keyword length, a long keyword is divided into short sub-keyword...
STD and SCR Techniques and Their Evaluations on the NTCIR-10 SpokenDoc-2 Task
This paper describes spoken term detection (STD) and spoken content retrieval (SCR) techniques and their evaluation at the NTCIR-10 SpokenDoc-2 task. First, we describe our STD technique, which uses a phoneme transition network (PTN) derived from multiple speech recognizers’ outputs, and its evaluation on the STD and iSTD (inexistent STD) tasks. Next, we introduce our SCR technique usin...
STD Method Based on Hash Function for NTCIR11 SpokenQuery&Doc Task
In this paper, we describe a spoken term detection (STD) method used in the Spoken Query and Documents task of the NTCIR-11 meeting. Our STD method extracts sub-sequences from the syllable-based speech recognition candidates of the target speech and converts them into bit sequences using a hash function. The query is converted into a bit sequence in the same way. Term detection candidates ...
Overview of the NTCIR-10 SpokenDoc-2 Task
This paper presents an overview of the IR for Spoken Documents Task in the NTCIR-10 Workshop. The task comprises a spoken term detection (STD) subtask and an ad-hoc spoken content retrieval (SCR) subtask. Both subtasks target searching for terms, passages, and documents in academic oral presentations. This paper explains the data used in the tasks, how to make transcriptions by spee...
Spoken Term Detection Using Multiple Speech Recognizers' Outputs at NTCIR-9 SpokenDoc STD subtask
This paper describes spoken term detection (STD) with false detection control using a phoneme transition network (PTN) derived from multiple speech recognizers’ outputs at the NTCIR-9 SpokenDoc STD subtask. Using the output of multiple speech recognizers, the PTN method is effective at correctly detecting out-of-vocabulary (OOV) terms and is robust to certain recognition errors. However, it exhibits ...
Publication date: 2013